Skip to content

feat: add dialog context handling and omit empty text results configuration#2158

Merged
wangyoucao577 merged 3 commits into
mainfrom
feature/yuyidi/update_bytedance_asr
May 12, 2026
Merged

feat: add dialog context handling and omit empty text results configuration#2158
wangyoucao577 merged 3 commits into
mainfrom
feature/yuyidi/update_bytedance_asr

Conversation

@diyuyi-agora
Copy link
Copy Markdown
Contributor

No description provided.

@github-actions
Copy link
Copy Markdown

PR Review — #2158

Reviewed: feat: add dialog context handling and omit empty text results configuration

Overall this is a well-scoped change with thoughtful defensive coding in the nested path setter. Below are the concerns worth addressing before merge.

Correctness / behavior

  1. update_configs mutates config.params before tearing down the old connection (extension.py _run_update_configs).
    The sequence is: mutate → stop_connection()start_connection(). Because stop_connection() is awaited, other coroutines (e.g. anything triggered by the current websocket loop) can observe the new params while the old WebSocket session is still winding down. Safer order:

    await self.stop_connection()
    _apply_string_field_under_params(params, ...)
    await self.start_connection()
    

    This also makes rollback-on-start_connection-failure cleaner.

  2. Dead sent assignment in the not utterances fallback (_on_asr_result):

    sent = await self._send_asr_result_from_text(
        text=result.text, is_final=False, ...
    )
    return

    sent is never read (and is_final=False means it wouldn't drive has_final_result anyway). Drop the sent =.

  3. parent_not_dict branch is effectively unreachable for multi-segment paths in _apply_string_field_under_params: once the loop runs, cur is always a dict (or we return early). It fires only for a single-segment path where the caller passed a non-dict params, which the caller already guards against. Not harmful — just dead defense. The docstring's invariant ("params is left unchanged for that traversal") does hold since we only auto-create intermediates when keys are missing, and freshly-created empty dicts cannot then contain non-dict values.

Security

  1. No allowlist on field_path. update_configs can rewrite any key under params, including access_key, app_key, api_key, api_url. If this cmd is reachable from untrusted edges, this is a credential-rotation / exfiltration primitive. Consider either (a) an allowlist of modifiable paths (e.g. omit_empty_text_results, request.*) or (b) explicitly document the trust boundary and deny mutation of secret fields.

  2. _send_full_client_request logs the request section at info level (volcengine_asr_client.py). The request block can carry corpus.context / hot-words / user-provided prompt text that may include PII. to_json(sensitive_handling=True) was carefully designed to redact — the new log bypasses that path. Suggest log_debug, or selectively redact fields known to carry user content.

  3. Exception leak in on_cmd: message=str(e) on the unexpected-exception path can surface stack-level internals to the caller. Consider message="internal_error" for the cmd result and keep the detailed str(e) only in the log line.

Tests

  1. Coverage is thin for the new surface area. test_apply_string_field_under_params.py covers only the helper. Worth adding:
    • omit_empty_text_results=True — assert _send_asr_result_from_text returns False and no send_asr_result is dispatched; also check that an empty final result does not flip has_final_result (it previously did — this is a meaningful semantic change).
    • on_cmd update_configs — bad JSON, missing schema_version, non-string value, non-dict payload, unknown field path (after allowlisting, if added). Asserting both StatusCode and the (code, message) properties.
    • That a successful update_configs actually triggers stop_connection + start_connection once (mock them).

Nits

  1. Version bump 0.3.200.4.1 skips 0.4.0. Intentional?

  2. PR title mentions "dialog context handling" but the code ships a generic nested-path setter (the docstring only references the prior dialog_ctx clear). If dialog context is the motivating use case, fine — but the title oversells specificity. A more accurate phrasing: feat(bytedance_llm_based_asr): add update_configs cmd and omit_empty_text_results.

  3. Per project convention (AGENTS.md), commit subjects are type(scope): lowercase description. The second commit (feat: change update_configs) is vague — consider squashing or clarifying on merge.

  4. _update_configs_lock only serializes update_configs against itself, not against the reconnect machinery that also invokes stop_connection/start_connection on error paths. If that's a concern in practice, the two paths need a shared lock.

What looks good

  • The helper's fail-closed semantics (no silent overwrite of scalars/lists) and the dedicated test for that case.
  • Returning bool from _send_asr_result_from_text and using it to gate has_final_result — correct handling of the new suppression flag.
  • schema_version gating on the cmd payload — good forward-compat hygiene.
  • Using StatusCode.ERROR + (code, message) properties matches the documented ten_ai_base convention.

@diyuyi-agora diyuyi-agora force-pushed the feature/yuyidi/update_bytedance_asr branch from a558861 to d2231b9 Compare May 12, 2026 07:51
@github-actions
Copy link
Copy Markdown

PR Review — #2158

Thanks for the patch! Here's my feedback, ordered by severity.

🔴 Blocking — test imports a function that does not exist

tests/test_apply_string_field_under_params.py imports:

from bytedance_llm_based_asr.extension import _apply_string_field_under_params

But _apply_string_field_under_params is not defined anywhere in this PR or in the current codebase (grep across the extension dir finds zero matches). The test will fail at collection / import time.

Two plausible explanations:

  1. The function was intended to be the replacement for _strip_empty_request_corpus_context (a generic nested-path setter) and got left out of the commit.
  2. The tests were carried over from another branch/PR and are stale.

Please either add the implementation or remove / rewrite the test file. As-is, CI for this extension should be red.

🟠 Test coverage gaps for the new behavior

The only test added references a non-existent helper. The actually new, user-visible behavior has no tests:

  • Config.get_omit_empty_text_results() + the early-return in _send_asr_result_from_text (including the subtle has_final_result interaction — see next item).
  • _deep_merge_dict — edge cases like dict-over-non-dict, non-dict-over-dict, nested empties.
  • _strip_empty_request_corpus_context — empty-string vs missing vs non-dict request/corpus.
  • The update_configs on_cmd handler — happy path, invalid JSON, payload-not-object, config-not-loaded, and lock re-entrancy.

🟢 Nice catch — but worth calling out

The rewrite around has_final_result is correct and subtle: the old code set has_final_result = True before the send, so with omit_empty_text_results=True a dropped final would still trigger the finalize end signal path downstream. The new pattern

sent = await self._send_asr_result_from_text(...)
if is_final and sent:
    has_final_result = True

fixes that. Good. This is exactly the kind of non-obvious invariant that deserves a unit test so it doesn't regress.

🟡 _strip_empty_request_corpus_context — narrowly magic

Special-casing request.corpus.context == "" to mean delete the key is a hidden semantic: callers who legitimately want to send an empty context string can't. It's also coupled to one specific nested path — if another field ever needs the same treatment, this pattern doesn't scale.

Two suggestions:

  • Document this semantic in the update_configs payload contract (if there is one).
  • If a generic "clear this nested key" behavior is needed, a sentinel (e.g. null → delete, \"\" → keep as empty string) is more predictable than overloading \"\".

This also ties back to the missing _apply_string_field_under_params — my guess is that function was meant to be the generic version of this, which would make the special case obsolete.

🟡 Request payload is logged on every connect at INFO

In volcengine_asr_client._send_full_client_request:

request_payload = {\"request\": self.config.get_request_config()}
if self.ten_env:
    self.ten_env.log_info(\"full_client_request params (request section): \" + json.dumps(...))

Two concerns:

  1. Noise / privacy: request.corpus.context may contain dialog history or user-provided prompt text. Logging it at INFO on every (re)connect is a data-exposure risk in shared log aggregators. Consider log_debug, or redact corpus.context.
  2. Drift: request_payload is built for logging only; the actual wire payload is produced by RequestBuilder.new_full_client_request(self.seq, self.config). If those ever diverge, the log becomes misleading. If you want to log the real payload, log what RequestBuilder actually produced.

🟡 update_configs handler — small things

File: extension.py

  1. cast(str, url_val).strip() — unnecessary; isinstance(url_val, str) already narrows the type. Drop the cast and the typing.cast import cost.
  2. All error paths return code=-1. If callers distinguish failure modes programmatically, differentiated codes (invalid JSON vs config-not-loaded vs unexpected exception) are friendlier than string matching on message.
  3. await self.stop_connection() / await self.start_connection() runs under the lock. If start_connection blocks on network, it holds _update_configs_lock for the full reconnect window — subsequent update_configs calls queue behind it. Probably fine (that's the intent), but worth being explicit that this is serialized-reconnect-by-design.
  4. The outer try/except Exception is broad — OK for a command boundary, but logging str(e) without traceback makes post-mortem harder. Consider self.ten_env.log_error(..., exc_info=True) if your logger supports it, or include traceback.format_exc().

🔵 Minor

  • Version bump 0.3.20 → 0.4.2 skips 0.4.0 / 0.4.1. If this is intentional (matching a released artifact elsewhere) great; otherwise 0.4.0 reads more naturally for this feature.
  • PR title says "dialog context handling" but the only dialog-context-related code is the implicit empty-string-clear in _strip_empty_request_corpus_context. A short note in the PR description on the intended update_configs payload shape (what keys, types, semantics of empty values) would help reviewers and future maintainers.
  • _deep_merge_dict mutates target in place and does not deep-copy values from patch. Fine for the current call path (payload is freshly parsed JSON, not retained), but worth a one-line comment so a future caller doesn't get bitten.

Summary

The core feature (omit-empty-text results + runtime reconfig + reconnect) is a clean, useful addition and the has_final_result fix is a real bug squash. Blocking issue is the broken test import — please resolve that (either land the missing _apply_string_field_under_params or remove the test file) and add coverage for the genuinely new behavior before merging.

@diyuyi-agora diyuyi-agora force-pushed the feature/yuyidi/update_bytedance_asr branch from d2231b9 to 393290d Compare May 12, 2026 08:43
@github-actions
Copy link
Copy Markdown

Code Review: PR #2158 — Dialog context handling + omit empty text results

Focused change with good test coverage for the new helpers. A few observations:

Correctness

  • has_final_result semantics shifted. It is now set only when sent is True, so if omit_empty_text_results=true and the final result's text is empty, the "finalize end signal" block no longer fires. For the new opt-in path this is almost certainly desired, but worth calling out in the PR description — it changes when downstream consumers see the end signal.
  • _run_update_configs always reconnects. An empty/no-op payload ({}, or one with only dump) still triggers stop_connection() + start_connection(). Consider short-circuiting when nothing meaningful changed. Compare with cartesia_tts's _requires_reconnection that gates reconnect on specific field diffs.
  • In-flight audio during reconnect. stop_connection/start_connection during active streaming will likely drop audio frames and any server-side session state. No buffering/flush logic is added. If this command is meant to be used mid-session, a brief note on the contract (caller is expected to pause input, or frames are dropped) would help.
  • dump flag. self.config.dump = bool(payload["dump"]) mutates the config, but it's only effective if start_connection() rereads that field to rebuild the Dumper. Worth verifying the new dumper picks up the flipped flag on reconnect.

Design

  • _strip_empty_request_corpus_context is narrowly scoped. Hard-coding the rule to request.corpus.context == "" is brittle — future fields that want "send empty string to clear" will each need a similar helper. Consider a more general convention (e.g. a reserved sentinel like null/None to mean "remove") in a follow-up.
  • _set_update_configs_cmd_result. The comment says "match ten_ai_base cmd_in convention." If that convention exists as a shared helper in ten_ai_base, prefer calling it directly so all extensions stay in sync.
  • on_cmd dispatch. Fine for a single custom command today, but the nested try/except with repeated _set_update_configs_cmd_result(bad, code=-1, message=...) is verbose; a small _error_result(cmd, msg) helper would roughly halve the block.

Logging / security

  • get_request_config() doesn't include auth secrets (those live on params at the top level), so the new info-level log in volcengine_asr_client._send_full_client_request is not a direct secret leak. However, request.corpus.context, hot-word lists, or user-tuned prompts can contain business-sensitive content, and this logs the entire section as JSON on every connection (and every reconnect triggered by update_configs). Consider:
    • logging at debug level, or
    • logging only the keys / a size summary, similar to the params_keys pattern already used in _run_update_configs.
  • _run_update_configs's log correctly logs only params.keys() — good.

Tests

  • Strong coverage for _deep_merge_dict, _strip_empty_request_corpus_context, the omit_empty_text_results gating, and the happy paths of _run_update_configs.
  • Gaps worth filling:
    • The on_cmd JSON-parsing/error branches (missing property, invalid JSON, non-object payload).
    • _run_update_configs with an empty payload (ties into the "always reconnects" point).
    • Confirming has_final_result stays False when an is_final=True result is skipped due to omit_empty_text_results=true, so the end-signal behavior is pinned.

Minor

  • Version jump 0.3.20 → 0.4.3 skips 0.4.00.4.2. Intentional?
  • Two logically independent features (omit-empty + update_configs) are bundled in one PR. Single PR is fine, but separate commits with feat(bytedance-asr): ... scoping would make bisecting cleaner.

Overall this looks solid — the main things I'd push on are avoiding an unnecessary reconnect on no-op payloads and toning down the new request-section log.

@wangyoucao577 wangyoucao577 merged commit 5a0dacd into main May 12, 2026
29 of 34 checks passed
@wangyoucao577 wangyoucao577 deleted the feature/yuyidi/update_bytedance_asr branch May 12, 2026 10:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants